AITopics

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

Neural Information Processing SystemsFeb-12-2026, 21:28:51 GMT

65a39213d7d0e1eb5d192aa77e77eeb7-Paper-Conference.pdf

large language model, machine learning, natural language, (21 more...)

Country:

Asia > Singapore (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry: Consumer Products & Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Neural Information Processing SystemsOct-8-2025, 19:47:02 GMT

Large Language Models as Commonsense Knowledge for Large-Scale Task Planning Anonymous Author(s) Affiliation Address email Appendix 1 A Experimental environments 2 We use the VirtualHome simulator [

large language model, machine learning, natural language, (21 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

Neural Information Processing SystemsOct-8-2025, 19:46:59 GMT

65a39213d7d0e1eb5d192aa77e77eeb7-Paper-Conference.pdf

large language model, machine learning, natural language, (21 more...)

Country:

Asia > Singapore (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Consumer Products & Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Ivanova, Anastasiia, Bakaeva, Eva, Volovikova, Zoya, Kovalev, Alexey K., Panov, Aleksandr I.

AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment

arXiv.org Artificial IntelligenceJun-5-2025

As a part of an embodied agent, Large Language Models (LLMs) are typically used for behavior planning given natural language instructions from the user. However, dealing with ambiguous instructions in real-world environments remains a challenge for LLMs. Various methods for task ambiguity detection have been proposed. However, it is difficult to compare them because they are tested on different datasets and there is no universal benchmark. For this reason, we propose AmbiK (Ambiguous Tasks in Kitchen Environment), the fully textual dataset of ambiguous instructions addressed to a robot in a kitchen environment. AmbiK was collected with the assistance of LLMs and is human-validated. It comprises 1000 pairs of ambiguous tasks and their unambiguous counterparts, categorized by ambiguity type (Human Preferences, Common Sense Knowledge, Safety), with environment descriptions, clarifying questions and answers, user intents, and task plans, for a total of 2000 tasks. We hope that AmbiK will enable researchers to perform a unified comparison of ambiguity detection methods. AmbiK is available at https://github.com/cog-model/AmbiK-dataset.

kitchen table, large language model, machine learning, (19 more...)

2506.04089

Country: Europe (0.92)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.67)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceFeb-12-2024

LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents

Choi, Jae-Woo, Yoon, Youngwoo, Ong, Hyobin, Kim, Jaehong, Jang, Minsu

Large language models (LLMs) have recently received considerable attention as alternative solutions for task planning. However, comparing the performance of language-oriented task planners becomes difficult, and there exists a dearth of detailed exploration regarding the effects of various factors such as pre-trained model selection and prompt construction. To address this, we propose a benchmark system for automatically quantifying performance of task planning for home-service embodied agents. Task planners are tested on two pairs of datasets and simulators: 1) ALFRED and AI2-THOR, 2) an extension of Watch-And-Help and VirtualHome. Using the proposed benchmark system, we perform extensive experiments with LLMs and prompts, and explore several enhancements of the baseline planner. We expect that the proposed benchmark tool would accelerate the development of language-oriented task planners.

apple, fridge, kitchen table, (17 more...)

2402.08178

Country: Asia > South Korea > Daegu > Daegu (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Aguinaldo, Angeline, Patterson, Evan, Fairbanks, James, Regli, William, Ruiz, Jaime

A Categorical Representation Language and Computational System for Knowledge-Based Planning

arXiv.org Artificial IntelligenceNov-14-2023

Classical planning representation languages based on first-order logic have preliminarily been used to model and solve robotic task planning problems. Wider adoption of these representation languages, however, is hindered by the limitations present when managing implicit world changes with concise action models. To address this problem, we propose an alternative approach to representing and managing updates to world states during planning. Based on the category-theoretic concepts of $\mathsf{C}$-sets and double-pushout rewriting (DPO), our proposed representation can effectively handle structured knowledge about world states that support domain abstractions at all levels. It formalizes the semantics of predicates according to a user-provided ontology and preserves the semantics when transitioning between world states. This method provides a formal semantics for using knowledge graphs and relational databases to model world states and updates in planning. In this paper, we conceptually compare our category-theoretic representation with the classical planning representation. We show that our proposed representation has advantages over the classical representation in terms of handling implicit preconditions and effects, and provides a more structured framework in which to model and solve planning problems.

countertop, representation, world state, (14 more...)

2305.17208

Country:

North America > United States > New York (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.88)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)

arXiv.org Artificial IntelligenceOct-30-2023

Large Language Models as Commonsense Knowledge for Large-Scale Task Planning

Zhao, Zirui, Lee, Wee Sun, Hsu, David

Large-scale task planning is a major challenge. Recent work exploits large language models (LLMs) directly as a policy and shows surprisingly interesting results. This paper shows that LLMs provide a commonsense model of the world in addition to a policy that acts on it. The world model and the policy can be combined in a search algorithm, such as Monte Carlo Tree Search (MCTS), to scale up task planning. In our new LLM-MCTS algorithm, the LLM-induced world model provides a commonsense prior belief for MCTS to achieve effective reasoning; the LLM-induced policy acts as a heuristic to guide the search, vastly improving search efficiency. Experiments show that LLM-MCTS outperforms both MCTS alone and policies induced by LLMs (GPT2 and GPT3.5) by a wide margin, for complex, novel tasks. Further experiments and analyses on multiple tasks -- multiplication, multi-hop travel planning, object rearrangement -- suggest minimum description length (MDL) as a general guiding principle: if the description length of the world model is substantially smaller than that of the policy, using LLM as a world model for model-based planning is likely better than using LLM solely as a policy.

container, fridge, llm, (16 more...)

2305.14078

Country:

Asia > Singapore (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry: Consumer Products & Services > Travel (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

EngadgetMay-10-2022, 15:00:07 GMT

Inside Microsoft's new Inclusive Tech Lab

On the north campus of Microsoft's 500-acre headquarters, anticipation is quietly mounting. The company is gearing up to launch its new Inclusive Tech Lab, which sits in Building 86 -- one of 125 buildings in its Redmond, Washington grounds. This 2,000-square-foot room used to be a reception area, with a set of doors leading to the offices within and another pair facing the rest of the world. It only seems fitting, considering what Microsoft envisions this lab to be: a place to welcome members of the disability community, the tech industry and its own designers. Across the street is building 88, where you'll find chief product officer Panos Panay's office, while down the road is the Hardware Lab in building 87.

inclusive tech lab, microsoft, romney, (13 more...)

Engadget

Country: North America > United States > Washington > King County > Redmond (0.24)

Industry:

Information Technology (0.34)
Leisure & Entertainment > Games > Computer Games (0.31)

Technology:

Information Technology > Communications (0.70)
Information Technology > Artificial Intelligence (0.48)

Los Angeles TimesDec-5-2021, 13:00:07 GMT

Terrified of COVID, she works at home. He goes to the office. What's a family to do?

He's a certified drug and alcohol counselor who opened a sober living house at the peak of last winter's deadly COVID-19 surge and is on-site at least six days a week. She works for a production company, colonized their kitchen table for her two outsize computer monitors and has stayed largely locked up in their 600-square-foot Mar Vista apartment, where they now dine on TV trays. "When L.A. was, like, the worst place on Earth for COVID, I was going out and looking at three houses a day," scouting locations for Hyperion Sober Living, said co-owner Jack Shain. Shain's job means he's out in the world nearly every day, where it's impossible to tell the vaccinated from the sick. Cara Ferraro's allows her to stay home with the cats, her anxiety and the ever-present pile of dishes in the sink.

apartment, ferraro, shain, (17 more...)

Los Angeles Times

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.05)
North America > United States > North Carolina (0.05)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.70)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.52)

Technology: Information Technology > Artificial Intelligence (0.35)